9.6 A Wire-Delay Scalable Microprocessor Architecture for High Performance Systems

Authors

  • Stephen W. Keckler
  • Doug Burger
  • Charles R. Moore
  • Ramadass Nagarajan
  • Karthikeyan Sankaralingam
  • Vikas Agarwal
  • Nitya Ranganathan
  • Premkishore Shivakumar
Abstract

While microprocessor pipeline depths have increased dramatically over the last decade, they are fast approaching their optimal depth. As shown in Figure 9.6.1, the number of logic levels per pipeline stage in modern processors is nearing 10 fanout-of-4 (FO4) inverter delays. Substantial further reductions will be undesirable due to pipeline overheads and power consumption [1]. Technology trends also show that global on-chip wire delays are growing significantly, eventually increasing cross-chip communication latencies to tens of cycles and shrinking the chip area reachable in a single cycle to less than 1% in a 35 nm technology, as shown in Figure 9.6.2. The challenge for architects is to design new architectures that achieve both a fast clock rate (low FO4) and high concurrency, despite slow global wires. Because existing superscalar microarchitectures rely on global communication, they are poorly matched to the technology challenges of the coming decade.
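The relationship between clock period, wire delay, and single-cycle reach described in the abstract can be illustrated with a rough back-of-the-envelope sketch. Everything here is an assumption made for illustration, not the authors' model: the function names, the square-die geometry, Manhattan routing from the die center, and a constant wire delay per millimetre are all hypothetical, and the numbers in the usage note are not taken from the paper.

```python
def clock_period_ps(fo4_per_stage: float, fo4_delay_ps: float) -> float:
    """Clock period = logic depth per stage (in FO4) x delay of one FO4 inverter.

    Both inputs are illustrative parameters, not values from the paper.
    """
    return fo4_per_stage * fo4_delay_ps


def reachable_fraction(period_ps: float,
                       wire_delay_ps_per_mm: float,
                       chip_edge_mm: float) -> float:
    """Fraction of a square die reachable from its center in one clock cycle.

    Assumes (hypothetically) Manhattan routing and a constant wire delay
    per millimetre, so the reachable region is a diamond |x| + |y| <= reach.
    """
    reach_mm = period_ps / wire_delay_ps_per_mm
    die_area = chip_edge_mm ** 2
    if reach_mm >= chip_edge_mm:
        # The diamond covers the whole die, corners included.
        return 1.0
    if reach_mm <= chip_edge_mm / 2:
        # The diamond lies entirely inside the die: area = 2 * reach^2.
        return 2 * reach_mm ** 2 / die_area
    # The diamond is clipped by the die edges: subtract the four corner
    # triangles that remain out of reach.
    unreachable = 2 * (chip_edge_mm - reach_mm) ** 2
    return (die_area - unreachable) / die_area
```

As a usage example with made-up inputs, `reachable_fraction(clock_period_ps(10, 10), 100, 10)` models a 10 FO4 clock with a 10 ps FO4 delay on a 10 mm die whose wires cost 100 ps/mm; shortening the period or slowing the wires shrinks the reachable fraction quadratically, which is the trend the abstract points to.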

Similar resources

Design of a novel congestion-aware communication mechanism for wireless NoC architecture in multicore systems

Hybrid Wireless Network-on-Chip (WNoC) architecture has emerged as a scalable communication structure to mitigate the deficits of traditional NoC architecture for future multi-core systems. The hybrid WNoC architecture provides energy-efficient, high-data-rate, and flexible communication for NoC architectures. In these architectures, each wireless router is shared by a set of processing core...

Versatility and VersaBench: A New Metric and a Benchmark Suite for Flexible Architectures

With the increasing miniaturization of transistors, wire delay and power consumption are emerging as the most formidable barriers to the scalability of microprocessors. Overcoming these barriers requires a fundamental rethinking of both microprocessor design and the programming models they support. Toward the former, new architecture designs are focusing on scalable and distributed alternatives...

Modeling technology impact on cluster microprocessor performance

The growing speed gap between transistors and wire interconnects is forcing the development of distributed, or clustered, architectures. These designs partition the chip into small regions with fast intracluster communication. Longer latency is required to communicate between clusters. The hardware and/or software are responsible for scheduling instructions to clusters such that critical path c...

TITAC-2: An asynchronous 32-bit microprocessor based on Scalable-Delay-Insensitive model

Asynchronous design has the potential to solve many difficulties, such as clock skew and power consumption, that its synchronous counterpart suffers from with current and future VLSI technologies. This paper proposes a new delay model, the scalable-delay-insensitive (SDI) model, for dependable and high-performance asynchronous VLSI system design. Then, based on the SDI model, the paper presents the des...

Performance Limits Due to Inter-Cluster Data Forwarding in Wire-Limited ILP Microprocessors

The growing speed gap between transistors and wire interconnects is forcing the development of distributed, or clustered, architectures. These designs partition the chip into small regions with fast intra-cluster communication. Longer latency is required to communicate between clusters. The hardware and/or software is responsible for scheduling instructions to clusters such that critical path c...

Publication date: 1997